Machine Learning for Biometrics
Biometrics aims at reliable and robust identification of humans from their personal traits, mainly for security and authentication purposes, but also for identifying and tracking the users of smarter applications. Frequently considered modalities are fingerprint, face, iris, palmprint and voice, but there are many other possible biometrics, including gait, ear image, retina, DNA, and even behaviours. This chapter presents a survey of machine learning methods used for biometrics applications, and identifies relevant research issues. We focus on three areas of interest: offline methods for biometric template construction and recognition, information fusion methods for integrating multiple biometrics to obtain robust results, and methods for dealing with temporal information. By introducing exemplary and influential machine learning approaches in the context of specific biometrics applications, we hope to provide the reader with the means to create novel machine learning solutions to challenging biometrics problems.
Technoscience Art: A Bridge between Neuroesthetics and Art History?
One of the recent and exciting developments in mainstream art history is its confrontation with the cognitive sciences and neurology. This study is based on the problems these disciplines face before they can contribute to each other. We inspect several critical issues resulting from this encounter, especially in the context of the recently developing field of neuroesthetics. We argue that it is the language barrier between the disciplines, rather than any fundamental conceptual division, that causes the lack of understanding on both sides. Shared terms in arts and neuroscience are elusive, and the different connotations of extant terms in these separate disciplines must be addressed. We propose technoscience art as a ground where joint terminology may be developed, an audience familiar with the concerns of both sides can be formed, and a new generation of scientifically-knowledgeable artists and scientists can interact for their mutual benefit.
Registration of 3D Face Scans with Average Face Models
The accuracy of a 3D face recognition system depends on a correct registration that aligns the facial surfaces and makes a comparison possible. The best results obtained so far use a costly one-to-all registration approach, which requires the registration of each facial surface to all faces in the gallery. We explore the approach of registering the new facial surface to an average face model (AFM), which automatically establishes correspondence to the pre-registered gallery faces. We propose a new algorithm for constructing an AFM, and show that it works better than a recent approach. Extending the single-AFM approach, we propose to employ category-specific alternative AFMs for registration, and evaluate the effect on subsequent classification. We perform simulations with multiple AFMs that correspond to different clusters in the face shape space and compare these with gender and morphology based groupings. We show that the automatic clustering approach separates the faces into gender and morphology groups, consistent with the other-race effect reported in the psychology literature. We inspect thin-plate spline and iterative closest point based registration schemes under manual or automatic landmark detection prior to registration. Finally, we describe and analyse a regular re-sampling method that significantly increases the accuracy of registration.
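The iterative closest point (ICP) registration mentioned in this abstract alternates between finding point correspondences and solving for the best rigid transform. A minimal point-to-point sketch is shown below; the function and parameter names are our own, and a real face registration pipeline would add landmark-based initialization, outlier rejection, and the resampling the abstract describes.

```python
import numpy as np

def icp(src, dst, n_iter=20):
    """Minimal point-to-point ICP: rigidly aligns src to dst.

    src, dst: (n, 3) point sets. Returns the transformed src points.
    """
    P = src.copy()
    for _ in range(n_iter):
        # Nearest-neighbour correspondences (brute force).
        d = np.linalg.norm(P[:, None] - dst[None], axis=2)
        Q = dst[d.argmin(axis=1)]
        # Best rigid transform via the Kabsch/SVD solution.
        mp, mq = P.mean(axis=0), Q.mean(axis=0)
        U, _, Vt = np.linalg.svd((P - mp).T @ (Q - mq))
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:          # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        P = (P - mp) @ R.T + mq
    return P
```

Registering every probe scan to a single AFM with such a routine replaces the costly one-to-all scheme: correspondence to the pre-registered gallery comes for free once the probe is aligned to the model.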
Neural spike sorting with spatio-temporal features
The paper analyses signals that have been measured by brain probes during surgery. First, background noise is removed from the signals. The remaining signals are a superposition of spike trains, which are subsequently assigned to different families. For this, two techniques are used: classic PCA and code vectors. Both techniques confirm that amplitude is the distinguishing feature of spikes. Finally, the presence of various types of periodicity in spike trains is examined using correlation and the interval shift histogram. The results allow the development of a visual aid for surgeons.
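The PCA-based family assignment described above can be sketched as a projection onto the leading principal components followed by clustering. This is an illustrative reconstruction, not the authors' implementation; the function name, farthest-point initialization, and parameter defaults are our own choices.

```python
import numpy as np

def sort_spikes(waveforms, n_components=2, n_clusters=3, n_iter=50):
    """Cluster spike waveforms: PCA projection followed by k-means.

    waveforms: (n_spikes, n_samples) array of aligned spike snippets.
    Returns one cluster label per spike.
    """
    # Center the waveforms and project onto the top principal components.
    X = waveforms - waveforms.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    feats = X @ Vt[:n_components].T

    # Farthest-point initialization, then plain k-means in feature space.
    idx = [0]
    for _ in range(n_clusters - 1):
        d = np.min(np.linalg.norm(feats[:, None] - feats[idx][None], axis=2), axis=1)
        idx.append(int(d.argmax()))
    centers = feats[idx].copy()
    for _ in range(n_iter):
        d = np.linalg.norm(feats[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for k in range(n_clusters):
            if (labels == k).any():
                centers[k] = feats[labels == k].mean(axis=0)
    return labels
```

If amplitude is the distinguishing feature, as the paper finds, the first principal component tends to capture it, and the clusters separate along that axis.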
Video-based estimation of pain indicators in dogs
Dog owners are typically capable of recognizing behavioral cues that reveal subjective states of their dogs, such as pain, but automatic recognition of the pain state is very challenging. This paper proposes a novel video-based, two-stream deep neural network approach for this problem. We extract and preprocess body keypoints, and compute features from both keypoints and the RGB representation over the video. We propose an approach to deal with self-occlusions and missing keypoints. We also present a unique video-based dog behavior dataset, collected by veterinary professionals and annotated for presence of pain, and report good classification results with the proposed approach. This study is one of the first works on machine learning-based estimation of dog pain state.
Elucidating the Exposure Bias in Diffusion Models
Diffusion models have demonstrated impressive generative capabilities, but
their 'exposure bias' problem, described as the input mismatch between training
and sampling, lacks in-depth exploration. In this paper, we systematically
investigate the exposure bias problem in diffusion models by first analytically
modelling the sampling distribution, based on which we then attribute the
prediction error at each sampling step as the root cause of the exposure bias
issue. Furthermore, we discuss potential solutions to this issue and propose an
intuitive metric for it. Along with the elucidation of exposure bias, we
propose a simple, yet effective, training-free method called Epsilon Scaling to
alleviate the exposure bias. We show that Epsilon Scaling explicitly moves the
sampling trajectory closer to the vector field learned in the training phase by
scaling down the network output (Epsilon), mitigating the input mismatch
between training and sampling. Experiments on various diffusion frameworks
(ADM, DDPM/DDIM, EDM, LDM), unconditional and conditional settings, and
deterministic vs. stochastic sampling verify the effectiveness of our method.
For example, our ADM-ES, as a SOTA stochastic sampler, obtains 2.17 FID on
CIFAR-10 dataset under 100-step unconditional generation. The code is available
at \url{https://github.com/forever208/ADM-ES} and
\url{https://github.com/forever208/EDM-ES}.Comment: under revie
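Because Epsilon Scaling is training-free, it amounts to one extra division inside the sampling loop. The sketch below shows where it would sit in a generic deterministic DDIM-style step; the function name and the scaling value `lam` are illustrative only (the paper tunes the factor per model and step budget), not the authors' exact implementation.

```python
import numpy as np

def ddim_step_es(eps_model, x_t, t, t_prev, alphas_cumprod, lam=1.005):
    """One deterministic DDIM-style step with Epsilon Scaling.

    The predicted noise is divided by a constant `lam` slightly greater
    than 1 before use, nudging the sampling trajectory back toward the
    vector field seen during training.
    """
    a_t, a_prev = alphas_cumprod[t], alphas_cumprod[t_prev]
    eps = eps_model(x_t, t) / lam                        # Epsilon Scaling
    x0 = (x_t - np.sqrt(1 - a_t) * eps) / np.sqrt(a_t)   # predicted clean sample
    return np.sqrt(a_prev) * x0 + np.sqrt(1 - a_prev) * eps
```

Setting `lam=1.0` recovers the unmodified sampler, which makes the method easy to A/B test on an existing pipeline.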
Is Everything Fine, Grandma? Acoustic and Linguistic Modeling for Robust Elderly Speech Emotion Recognition
Acoustic and linguistic analysis for elderly emotion recognition is an under-studied and challenging research direction, but essential for the creation of digital assistants for the elderly, as well as unobtrusive telemonitoring of the elderly in their residences for mental healthcare purposes. This paper presents our contribution to the INTERSPEECH 2020 Computational Paralinguistics Challenge (ComParE) - Elderly Emotion Sub-Challenge, which comprises two ternary classification tasks for arousal and valence recognition. We propose a bi-modal framework, where these tasks are modeled using state-of-the-art acoustic and linguistic features, respectively. In this study, we demonstrate that exploiting task-specific dictionaries and resources can boost the performance of linguistic models when the amount of labeled data is small. Observing a high mismatch between development and test set performances of various models, we also propose alternative training and decision fusion strategies to better estimate and improve the generalization performance.
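Decision fusion of the kind mentioned above is often realized as a weighted late fusion of per-class posteriors from the two modalities. A minimal sketch, assuming each stream outputs calibrated class probabilities (function name and the default weight are our own; in practice the weight would be tuned on a held-out set):

```python
import numpy as np

def fuse_decisions(p_acoustic, p_linguistic, w=0.6):
    """Weighted late fusion of per-class posteriors from two modalities.

    p_acoustic, p_linguistic: (n_samples, n_classes) probability arrays.
    w: weight on the acoustic stream (illustrative default).
    Returns the fused class prediction per sample.
    """
    fused = w * p_acoustic + (1 - w) * p_linguistic
    return fused.argmax(axis=1)
```

Because the fusion weight is a single scalar, it can be selected on the development set, which is one way to address the development-test mismatch the abstract reports.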
Geeks and guests: Estimating player's level of experience from board game behaviors
Board games have become promising tools for observing and studying social behaviors in multi-person settings. While traditional methods such as self-report questionnaires are used to analyze game-induced behaviors, there is a growing need to automate such analyses. In this paper, we focus on estimating the levels of board game experience by analyzing a player's confidence and anxiety from visual cues. We use a board game setting to induce relevant interactions, and investigate facial expressions during critical game events. For our analysis, we annotated the critical game events in a multiplayer cooperative board game, using the publicly available MUMBAI board game corpus. Using off-the-shelf tools, we encoded facial behavior in dyadic interactions and built classifiers to predict each player's level of experience. Our results show that considering the experience level of both parties involved in the interaction simultaneously improves the prediction results.